Methods and apparatus for controlling a user interface based on the emotional state of a user

ABSTRACT

Methods and apparatus for modifying a user interface as a function of the detected emotional state of a system user are described. In one embodiment, stress analysis is performed on received speech to generate an emotional state indicator value, e.g., a stress level indicator value. The stress level indicator value is compared to one or more thresholds. If a first threshold is exceeded, the user interface is modified, e.g., the presentation rate of speech is slowed. If a second threshold is not exceeded, another modification to the user interface is made, e.g., the speech presentation rate is accelerated. If the stress level indicator value is between the first and second thresholds, user interface operation continues unchanged. The user interface modification techniques of the present invention may be used in combination with known knowledge or expertise based user interface adaptation features.

RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 10/200,898, filed on Jul. 22, 2002 and entitled “METHODS AND APPARATUS FOR CONTROLLING A USER INTERFACE BASED ON THE EMOTIONAL STATE OF A USER”, which is hereby expressly incorporated by reference in its entirety.

FIELD OF THE INVENTION

The present invention relates to user interfaces and, more particularly, to adaptive user interfaces which adapt in response to, e.g., a characteristic or state of a user.

BACKGROUND OF THE INVENTION

Electronic systems including computer systems that interact with human beings use what is commonly known as a user interface to control one or more aspects of machine/human interaction. User interfaces have provided formidable challenges to the user interface developers who normally seek to provide an interface which is both easy to use and efficient from a user's perspective. That is, a user interface should allow a user to accomplish a desired objective, e.g., receive a desired set of information or make a desired selection, without a lot of unnecessary user operations and without having to have menu options repeated, e.g., because of a lack of understanding upon initial presentation. It has been recognized that to provide an effective and efficient user interface, the interface should have flexibility and be able to adapt to the individual user.

The decision process of modifying the user interface to adapt to an individual user can be extremely difficult because the adaptation process attempts to customize the interface for a particular user, often without specific knowledge of a user's current condition. A user's current condition may include such things as a user's intentions, goals, and/or informational needs, e.g., how experienced the user is with the system and/or what the user's knowledge base is.

Known user interfaces have focused on attempting to determine a user's experience level and/or goals and then modifying the user interface accordingly. For example, users who are determined to be experts based on their amount of previous experience with a system may be presented less help when traversing menus and options than users who are determined to be novices at using an apparatus. Similarly, when a user's goal is determined to be a particular operation or one of a set of operations, e.g., based on previous menu selections, a user may be presented with specific menus tailored or arranged to facilitate the user's goal.

Such known systems, which focus on modifying a user interface, fail to consider the impact of a user's emotional state at the time of using an apparatus. The inventor of the present application recognized that a user's emotional state can have an impact on a user's ability to interpret information provided to the user, make menu selections, and perform other tasks commonly performed through use of a user interface. For example, when under extreme stress, e.g., angry or grieving, a system user may find it more difficult to interpret and understand machine-generated speech, menu options, etc., than under less stressful conditions.

Thus, it is very possible for a user to be fairly knowledgeable about a user interface on a particular system, but still be impaired in his capacity to interact with the user interface due to the presence of emotions, e.g., anxiety, grief, anger, etc. Such impairment can be either long or short term in nature.

In view of the above discussion, it becomes apparent that it would be desirable for a user interface to adapt in response to a user's emotional state. Accordingly, there is a need for methods and apparatus that allow a user interface for any product or service to detect the emotional state of a user and to be modified based upon a user's current emotional state (e.g., stress level). Furthermore, it would be desirable that at least some new user interfaces be able to make user interface modifications in response to a user's emotional state while also making modifications based on other known factors such as experience level and knowledge.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a system implemented in accordance with the invention.

FIG. 2 illustrates a computer system responsible for performing stress analysis on speech in accordance with the invention.

FIG. 3, which comprises the combination of FIGS. 3A and 3B, is a flow diagram showing the steps of a method of the present invention.

FIG. 4, which comprises the combination of FIGS. 4A and 4B, is a flow diagram showing a particular exemplary embodiment of the method of the present invention.

SUMMARY OF THE INVENTION

The present invention is directed to user interfaces and methods and apparatus for modifying a user interface as a function of the detected emotional state of a system user. The emotional state of the user may be, e.g., how stressed the user of the system is. The user's stress level may be detected, e.g., from received speech.

In one particular exemplary embodiment, stress analysis is performed on received speech to generate an emotional state indicator value, e.g., a stress level indicator value. The stress level indicator value is compared to one or more thresholds. If a first threshold is exceeded, e.g., indicating a highly stressed emotional state, the user interface is modified, e.g., the presentation rate of speech is slowed. If a second threshold is not exceeded, e.g., indicating a low stress or relaxed emotional state, an alternative modification to the user interface is made, e.g., the speech presentation rate is accelerated. If the stress level indicator value is between first and second thresholds, user interface operation continues unchanged.
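By way of illustration only, the two-threshold decision just described can be sketched as follows. The threshold values, function name, and action labels are assumptions made for this sketch and are not part of the described method.

```python
def select_interface_action(stress_level: float,
                            first_threshold: float = 0.8,
                            second_threshold: float = 0.3) -> str:
    """Map a stress level indicator value to a user interface action.

    Assumes first_threshold > second_threshold; the values are illustrative.
    """
    if stress_level > first_threshold:
        # Highly stressed user: slow the presentation rate of speech.
        return "slow_presentation"
    if stress_level < second_threshold:
        # Relaxed user: accelerate the presentation rate of speech.
        return "accelerate_presentation"
    # Intermediate stress level: leave the user interface unchanged.
    return "no_change"


if __name__ == "__main__":
    for value in (0.9, 0.5, 0.1):
        print(value, select_interface_action(value))
```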

In some embodiments, the emotional state indicator value is monitored over a period of time. If the emotional state indicator value continues to indicate a high level of stress, e.g., by a threshold being exceeded for a preselected period of time, despite user interface modifications, additional action may be taken, such as transferring the user to a human operator, e.g., a customer service representative or 911 telephone operator. The transferring of a caller may be viewed as a modification of the user interface since it may involve transitioning from a fully automated interface to an interface which relies on at least some human operator involvement.

The user interface modification techniques of the present invention may be used in combination with known knowledge or expertise-based user interface adaptation features. Various exemplary modifications to a user interface which are made, alone or in combination, when a high level of stress is detected include:

-   a) Adding reassuring voice prompts to be output for the user to hear, e.g., “Good”, “Well Done!”, “That's right”;
-   b) Using lengthier, more explanatory and helpful prompting, e.g., instead of asking only “Limit order or market order?”, the system can explain, “Do you want a limit order or a market order? A limit order allows you to buy or sell a stock at a price you name, whereas a market order buys or sells a stock at the current market price.”;
-   c) Modifying the rate or speed at which voice prompts are played to the user so that the voice prompts are played more slowly and intelligibly to the user;
-   d) Adding or increasing the duration of pauses between sentences or phrases so as to increase the intelligibility of the voice prompts to the user;
-   e) Breaking down sets of tasks which the user must perform in the course of a given application into smaller parts, e.g., instead of asking a user “Type of order?”, the application might ask the stressed user “Buy or sell?” in one step, then “Market or limit order?” in another step, and “All-or-none or partial fill order?” in a third step, etc.; and
-   f) Modifying voice prompts played to the user so that they are played in a different voice, e.g., a female rather than a male voice or a child's voice rather than an adult's voice.

When a low level of stress is detected, the inverse of one or more of the above described modifications may be performed, e.g., the presentation rates of voice prompts may be increased, the duration of pauses reduced in speech generated by the apparatus, etc.

While described in the context of voice analysis and stress detection, it is to be understood that the user interface modification technique is not limited to the emotional condition of stress and that modifications of the user interface based on other emotional states and indicators of emotional states are possible. For example, facial expressions of a user may be monitored using a camera and the input from the camera analyzed to generate an indicator of the user's emotional state. For example, if it is determined that the user is smiling, it might be interpreted that the user is happy or contented, while a detected frown might be interpreted as indicating that the user is unhappy and that interface modification of the type discussed above in the case of a stressed emotional state may be called for.

The user interface modification techniques of the present invention are well suited for use in a wide range of user interface applications including, e.g., personal computer applications, telephone operator applications, automated telephone directory applications, control system applications, etc.

DETAILED DESCRIPTION

FIG. 1 illustrates a system 100 including an adaptive user interface implemented in accordance with the invention for measuring voice stress during speech recognition and adapting the user interface accordingly. The system 100 includes routines and hardware for call processing, for performing the communications operations, for performing speech recognition and analysis, and for performing interface modifications as a function of a measured stress level. The system supports communications between various locations via the public switched telephone network (PSTN) 106 and is capable of performing stress analysis on received speech. The system supports voice input for analysis from various locations.

The system 100 includes first and second locations 102, 104, coupled together by a communications network, PSTN 106. Location 1 102, which may be, e.g., a first customer premise such as a home or office, includes communication equipment, e.g., a telephone 110, and a person 108. The telephone 110 is coupled to a telephone switch 118 of the PSTN 106 via a communications channel 103, which may be, e.g., a telephone line.

Location 2 104 includes additional communication equipment, e.g., a second telephone 114, a computer system 116, and a person 112. Both the second telephone 114 and the computer system 116 are coupled to the telephone switch 118 of PSTN 106 via communication channel 105. Accordingly, both the telephone 114 and the computer system 116 can receive and transmit audio signals including speech via PSTN 106. Individuals 108 and 112 represent two potential sources of speech. As will be discussed further below, computer 116 performs stress analysis on speech received via telephone line 105 or from a local speech source and modifies a user interface in accordance with the invention based on detected stress levels.

FIG. 2 shows the computer system 116 in greater detail. The computer system 116 includes a memory 202, a central processing unit (CPU) 206, an I/O interface 212, a sound/speech processing card 216, a telephony card 218, and a microphone 220. The memory 202, the CPU 206, the I/O interface 212, the sound/speech processing card 216, and the telephony card 218 are coupled together by a bus 204 as shown. An output device 208 and an input device 210 are coupled to the I/O interface 212, which performs signal conversion and other interfacing functions allowing the devices coupled to bus 204 to interact with I/O devices 208, 210, respectively. Output device 208 may be, e.g., a display and/or speaker, while input device 210 may be, e.g., a keyboard, a camera, a sound detection device, etc. Microphone 220, which serves as one exemplary source of audio input, e.g., speech, is coupled to the sound/speech processing card 216. The sound/speech processing card 216 can also receive audio input from, e.g., telephony card 218 via bus 204. Thus, the sound/speech processing card 216 can receive and process either local audio signals or signals received by the telephony card 218 from the PSTN 106. Accordingly, the user may be local or remotely located.

The memory 202 includes a plurality of modules including an operating system 222, an interactive voice response (IVR) module 224, a speech recognition module 226, a stress analysis module 228, and a stress-based interface control module 230. While shown as part of the memory 202, it is to be understood that each of the modules 222, 224, 226, 228, 230 can be implemented using hardware, software, or a combination of hardware/software. In the case of a hardware/software combination, the software portion of the module would be stored in memory 202 with the hardware being included elsewhere in the system 116. The memory 202 may also be used for storing audio signals 223, e.g., to be processed by the sound/speech processing card 216. The operating system 222 performs the overall control and management of the computer hardware and interacts with the IVR module 224, the speech recognition module 226, the stress analysis module 228, and the stress-based interface control module 230 to facilitate their operation. The IVR module 224 may be implemented using standard commercial interactive voice response software operating alone or in combination with speech recognition module 226. Speech recognition module 226 may be implemented using known speech recognition software configured in accordance with the present invention to perform speech processing functions which may include speech recognition and to pass appropriately formatted speech samples to the stress analysis module 228. The stress analysis module 228 receives formatted samples of the user's speech as input and outputs a numerical quantification of the user's stress level. The stress analysis module 228 may be implemented using commercially available truth-verification software packages, e.g., TrusterPro by Trustech Innovative Technologies LTD. of Tel Aviv, Israel, which are capable of outputting such stress indicator levels. The stress-based interface control module 230 receives the numerical output from the stress analysis module 228, e.g., one for each sampled speech frame, and modifies the user interface as a function of the stress level indicator value, thereby adapting the interface as a function of the user's current stress level as detected by stress analysis module 228.
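The following sketch, provided for illustration only, shows one possible division of responsibilities between a stress analysis module and a stress-based interface control module. The class names, threshold values, and the two interface settings shown are assumptions; a real stress analysis module would derive its output from the audio rather than returning a fixed value.

```python
from dataclasses import dataclass


@dataclass
class InterfaceSettings:
    """Illustrative settings the interface control module may adjust."""
    prompt_rate: float = 1.0    # relative playback speed of voice prompts
    pause_seconds: float = 0.5  # pause inserted between sentences


class StressAnalysisModule:
    """Stand-in for a stress analysis module: speech sample -> stress value."""

    def analyze(self, speech_sample: bytes) -> float:
        # A real module would compute this from the audio; a fixed
        # placeholder value is returned here for the sketch.
        return 0.5


class StressBasedInterfaceControl:
    """Stand-in for a control module: adapts settings per stress value."""

    def __init__(self, settings: InterfaceSettings,
                 first_threshold: float = 0.8,
                 second_threshold: float = 0.3) -> None:
        self.settings = settings
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold

    def update(self, stress_value: float) -> None:
        if stress_value > self.first_threshold:
            # Highly stressed user: slow prompts and lengthen pauses.
            self.settings.prompt_rate = max(0.5, self.settings.prompt_rate - 0.1)
            self.settings.pause_seconds += 0.25
        elif stress_value < self.second_threshold:
            # Relaxed user: speed prompts up and shorten pauses.
            self.settings.prompt_rate = min(1.5, self.settings.prompt_rate + 0.1)
            self.settings.pause_seconds = max(0.0, self.settings.pause_seconds - 0.25)
        # Otherwise the interface is left unchanged.


if __name__ == "__main__":
    settings = InterfaceSettings()
    control = StressBasedInterfaceControl(settings)
    control.update(StressAnalysisModule().analyze(b"\x00" * 160))
    print(settings)
```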

FIG. 3 shows the steps of an exemplary method of the present invention, which may be performed by the computer system of FIG. 2. Operation begins with start step 302 with the computer system being powered up and executing various modules stored in its memory 202. Operation proceeds from start step 302 to step 304, wherein the computer system 116 receives speech input. The speech input may be obtained from the sound/speech processing card 216 via the microphone 220 or the telephony card 218. The received speech is sampled in step 306 by the sound/speech processing card 216, e.g., under the direction of the speech recognition module 226. In step 306, the sampled speech is processed to produce sets of sampled speech, e.g., frames. Each set of sampled speech produced in step 306 is subject to further processing beginning in step 308. In step 308, the stress analysis operation is performed by the stress analysis module 228, and a stress level indicator value is generated. One such value may be generated for each frame or for a set of frames, e.g., 5 frames. Over time, multiple stress level indicator values are generated. The stress level indicator value generated in step 308 is analyzed in step 310, e.g., by stress-based interface control module 230, to determine if it exceeds a first threshold. If the first threshold is exceeded, the user interface is modified in step 312 in accordance with the present invention. Some examples of user interface modifications used in various embodiments include: the addition of reassuring prompts, lengthier explanations, a decrease in the rate of prompts, increases in pause durations, or forwarding to a human operator. Additional examples are discussed with respect to FIG. 4.

If the first threshold for modifying the user interface is not exceeded, the stress level indicator value generated in step 308 is analyzed in step 314, e.g., by stress-based interface control module 230, to determine if it is below a second threshold. If the stress level indicator value is below the second threshold, the user interface is modified in step 316 in accordance with the present invention. Some examples of user interface modifications used in various embodiments include: briefer prompts, shorter or truncated explanations, an increased rate of prompts, or a decrease in pause durations. If the stress level indicator value is equal to or greater than the second threshold, operation proceeds to step 318, where the processing continues without modification of the user interface. Operation proceeds from either step 312, 316, or 318 to step 320, where the processing of the received speech segment for purposes of controlling the user interface is stopped. Processing of additional speech input in accordance with the method of FIG. 3 may continue over time, with modifications to the user interface over time being made as a function of detected stress level.
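A minimal sketch of the frame-oriented processing of FIG. 3 follows. The frame size, the grouping of five frames per stress value, and the stubbed stress computation are illustrative assumptions only.

```python
from typing import Callable, Iterator, List, Sequence

FRAME_SAMPLES = 160    # e.g., 20 ms of 8 kHz audio per frame (assumption)
FRAMES_PER_GROUP = 5   # one stress value may be generated per group of frames


def frames(samples: Sequence[int]) -> Iterator[Sequence[int]]:
    """Split sampled speech into fixed-size frames (step 306)."""
    for start in range(0, len(samples) - FRAME_SAMPLES + 1, FRAME_SAMPLES):
        yield samples[start:start + FRAME_SAMPLES]


def stress_level(frame_group: List[Sequence[int]]) -> float:
    """Placeholder for the stress analysis of step 308."""
    return 0.5


def process_speech(samples: Sequence[int],
                   modify_for_stress: Callable[[], None],
                   modify_for_calm: Callable[[], None],
                   first_threshold: float = 0.8,
                   second_threshold: float = 0.3) -> None:
    """Steps 306-318: frame the speech, score stress, and adapt the interface."""
    group: List[Sequence[int]] = []
    for frame in frames(samples):
        group.append(frame)
        if len(group) < FRAMES_PER_GROUP:
            continue
        value = stress_level(group)
        group.clear()
        if value > first_threshold:        # step 310 -> step 312
            modify_for_stress()
        elif value < second_threshold:     # step 314 -> step 316
            modify_for_calm()
        # else: step 318, continue without modifying the user interface


if __name__ == "__main__":
    process_speech([0] * 1600,
                   modify_for_stress=lambda: print("slowing prompts"),
                   modify_for_calm=lambda: print("speeding prompts"))
```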

FIG. 4 shows a particular embodiment of the invention in the context of a stock brokerage application. Operation begins with start step 402 with the computer system being powered up and the execution of various modules stored in its memory. Operation proceeds from start step 402 to step 404. In step 404, the computer system 116 prompts the person (user) 108 at location 1 102 or the person (user) 112 at location 2 104 with a message, e.g., “What Stock Do You Want to Trade?” In the case where said person is located at location 1 102, the prompt is transmitted via PSTN 106. In the case where the user of the system is present at location 2 104, the generated speech would be the output of a local output device 208, e.g., a speaker. In the next step 406, the person (user) 108 or 112 speaks and responds with an answer, e.g., “Verizon”. In the case where said person is located at location 1 102, the speech is received by telephone 110 at location 1 and transmitted via PSTN 106 to the telephony card 218 of computer system 116 at location 2 104. In the case where the user of the system is present at location 2 104, the speech is received by the computer system 116 via either a microphone 220 or a telephone 114 connected to telephony card 218.

The speech input obtained in step 406 is input to steps 408 and 410. In step 408, the speech recognition module 226 recognizes the user speech as the word “Verizon” and proceeds in step 412 to access from its database “Verizon” stock statistics, e.g., current price, Bid/Ask, trading volume, Day High, Day Low, 52 week high, 52 week low, and plays the statistics back to the person (user) 108 or 112. In step 410, the system records the user speech in an appropriate file format in memory 202, as required by the stress (truth) analysis module 228. In the next step 414, the system calls the stress (truth) analysis module 228 using the speech file generated in step 410 as input. The flow proceeds to step 416, where the stress (truth) analysis module 228 uses its proprietary algorithm on the speech file and outputs a numerical measure of stress, the stress level indicator value. The stress level indicator value obtained in step 416 is evaluated in step 418 by the stress-based interface control module 230. If the stress level indicator value is greater than the first threshold value, the flow proceeds to step 420. In step 420, the stress-based interface control module 230 modifies the user interface to adapt to the stressed user.

Examples of adaptations to the user interface that can be made include the following (see the sketch after this list):

-   a) Reassuring voice prompts can be added to the output for the user to hear, e.g., “Good”, “Well Done!”, “That's right”.
-   b) Lengthier, more explanatory and helpful prompting can be output, e.g., instead of asking only “Limit order or market order?” the system can explain, “Do you want a limit order or a market order? A limit order allows you to buy or sell a stock at a price you name, whereas a market order buys or sells a stock at the current market price.”
-   c) The rate, or speed, at which the voice prompts are played to the user can be slowed, so that the voice prompts are played more slowly and intelligibly to the user.
-   d) Pausing between sentences can be increased, so as to increase the intelligibility of the voice prompts to the user.
-   e) A set of tasks which the user must perform in the course of a given application may be broken down into smaller parts, e.g., instead of asking a user “Type of order?”, the application might ask the stressed user “Buy or sell?” in one step, then “Market or limit order?” in another step, and “All-or-none or partial fill order?” in a third step, etc.
-   f) Voice prompts played to the user can be played in a different voice, e.g., a female rather than a male voice, or a child's voice rather than an adult's voice.
-   g) For very high stress levels, with no detected lowering of the stress level indicator value through automated computer responses, the user could be switched to a human phone operator.
-   h) Unlimited variations in programming logic can direct the stressed user down a different path than the non-stressed user.
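As a hedged illustration of adaptations a) through f) above, the sketch below represents the prompt behavior as a configuration object that is rewritten when the first threshold is exceeded. The field names, default values, and example prompt text are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptConfig:
    """Illustrative prompt settings that may be adapted for a stressed user."""
    reassurance: List[str] = field(default_factory=list)  # a) reassuring prompts
    verbose_explanations: bool = False                     # b) lengthier prompting
    playback_rate: float = 1.0                             # c) prompt rate
    pause_seconds: float = 0.5                             # d) pause between sentences
    order_questions: List[str] = field(default_factory=lambda: ["Type of order?"])
    voice: str = "adult_male"                              # f) prompt voice


def adapt_for_high_stress(cfg: PromptConfig) -> PromptConfig:
    """Apply adaptations a) through f) when the first threshold is exceeded."""
    cfg.reassurance = ["Good", "Well Done!", "That's right"]
    cfg.verbose_explanations = True
    cfg.playback_rate = 0.8          # play prompts more slowly
    cfg.pause_seconds = 1.0          # longer pauses between sentences
    cfg.order_questions = [          # e) break the task into smaller steps
        "Buy or sell?",
        "Market or limit order?",
        "All-or-none or partial fill order?",
    ]
    cfg.voice = "adult_female"
    return cfg


if __name__ == "__main__":
    print(adapt_for_high_stress(PromptConfig()))
```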

If the first threshold for modifying the user interface is not exceeded, the stress level indicator value generated in step 416 is analyzed in step 422, e.g., by stress-based interface control module 230, to determine if it is below a second threshold. If the stress level indicator value is below the second threshold, the user interface is modified in step 424 in accordance with the present invention to adapt to the less stressed user.

Examples of adaptations to the user interface that can be made in the case of a low level of stress, as indicated by the stress level indicator being below the second threshold, include the following (see the sketch after this list):

-   a) Reduce the number of reassuring voice prompts utilized.
-   b) Briefer, abbreviated prompting can be output, e.g., instead of asking “Do you want a limit order or a market order? A limit order allows you to buy or sell a stock at a price you name, whereas a market order buys or sells a stock at the current market price.” the system could prompt only, “Do you want a limit order or a market order?”
-   c) The rate, or speed, at which the voice prompts are played to the user can be accelerated, so that the voice prompts are played more rapidly for the more comfortable user.
-   d) Pausing between sentences can be decreased, so as to decrease the total time required for the transaction.
-   e) A set of tasks which the user must perform in the course of a given application may be consolidated, e.g., instead of asking a user “Buy or sell?” in one step, then “Market or limit order?” in another step, and “All-or-none or partial fill order?” in a third step, etc., the application might simply ask the less stressed, more comfortable user “Type of order?” in one step.
-   f) Unlimited variations in programming logic can direct the less stressed user down a different path than the more stressed user.
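The sketch below illustrates adaptation e) above, i.e., consolidating a multi-step dialogue into a single prompt for a less stressed user. The threshold value and prompt text are assumptions made for the example.

```python
from typing import List


def order_dialogue(stress_value: float,
                   first_threshold: float = 0.8) -> List[str]:
    """Choose between consolidated and step-by-step prompting.

    A less stressed user is asked a single consolidated question; a more
    stressed user is walked through the same task in smaller steps.
    """
    if stress_value > first_threshold:
        return [
            "Buy or sell?",
            "Market or limit order?",
            "All-or-none or partial fill order?",
        ]
    return ["Type of order?"]


if __name__ == "__main__":
    print(order_dialogue(0.2))  # relaxed caller: one consolidated prompt
    print(order_dialogue(0.9))  # stressed caller: three smaller steps
```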

If the stress level indicator value is equal to or greater than the second threshold, operation proceeds to step 426, where the processing continues without modification of the user interface. Operation proceeds from either step 420, 424, or 426 to step 428, where the processing of the received speech segment for purposes of controlling the user interface is stopped.

Processing of additional speech input in accordance with the method of FIG. 4 may continue over time, with modifications to the user interface being made as a function of detected stress level. For example, in one embodiment the stress level indicator value detected during each of a plurality of different time periods is monitored to determine if the user continues to suffer from a highly stressed emotional state, as indicated by the stress level indicator value remaining above a particular set threshold for a period of time. In one such embodiment, where the user interface is part of an emergency call servicing apparatus, the user is transferred to a human operator in response to continuing to detect a highly stressed emotional state despite implementing one or more of the above described user interface modifications.
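The following sketch illustrates, under assumed values, how the stress level indicator values from successive time periods might be monitored and the caller transferred to a human operator when high stress persists. The threshold, the three-period window, and the return-value convention are assumptions made for this example.

```python
from collections import deque
from typing import Deque, Iterable


def monitor_and_escalate(stress_values: Iterable[float],
                         high_threshold: float = 0.8,
                         periods_required: int = 3) -> bool:
    """Return True (transfer to a human operator) if the stress level
    indicator value stays above the high threshold for the required number
    of consecutive time periods despite interface modifications."""
    recent: Deque[float] = deque(maxlen=periods_required)
    for value in stress_values:
        recent.append(value)
        if len(recent) == periods_required and all(v > high_threshold for v in recent):
            return True  # e.g., connect the caller to a customer service or 911 operator
    return False


if __name__ == "__main__":
    print(monitor_and_escalate([0.9, 0.85, 0.7, 0.9]))   # False: stress dipped
    print(monitor_and_escalate([0.6, 0.9, 0.85, 0.95]))  # True: sustained high stress
```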

Numerous variations on the above described user interface technique are possible. For example, user interface modifications can be, and in at least one embodiment are, based on the detected experience level or expertise of a user in addition to the user's detected emotional state. In some embodiments, images, as opposed to speech, are used as the input from which a user's emotional state is determined. Other embodiments falling within the scope of the invention will be apparent to those skilled in the art based on the above detailed description and figures included in the present application.

What is claimed is:
1. A method of controlling an apparatus, the method comprising the steps of: generating an emotional state indicator value by performing stress analysis on received user speech; comparing said emotional state indicator value to a first threshold to determine if said emotional state indicator value is above said first threshold; comparing said emotional state indicator value to a second threshold to determine if said emotional state indicator value is below said second threshold; and modifying a user interface in response to determining that said emotional state indicator value is above said first threshold or below said second threshold.
2. The method of claim 1, wherein modifying a user interface includes slowing the rate at which prompts are provided to a user of said apparatus when said emotional state indicator value is above said first threshold.
3. The method of claim 1, wherein modifying said user interface includes, when said emotional state indicator value is determined to be below said second threshold, speeding up the rate at which prompts are provided to a user of said apparatus.
4. The method of claim 1, further comprising: leaving said user interface unmodified when said emotional state indicator value is between said first and second thresholds.
5. The method of claim 1, wherein the step of modifying a user interface includes: altering the length of a pause between sentences included in speech generated by said user interface.
6. The method of claim 1, wherein the step of modifying a user interface includes: changing the number of options presented to a user as part of a menu from which the user can make a selection.
7. The method of claim 1, wherein the step of modifying a user interface includes: changing the rate at which prompts are provided to a user of said apparatus; changing the length of a pause between sentences included in speech generated by said user interface; and changing the number of options presented to a user as part of a menu from which the user can make a selection.
8. The method of claim 1, further comprising the steps of: periodically performing said generating step using speech input received during different periods of time to generate a set of emotional state indicator values, each emotional state indicator value corresponding to a different period of time; and monitoring said emotional state indicator values to determine if the generated emotional state indicator values stay above said second threshold for a predetermined period of time.
9. The method of claim 8, wherein said apparatus is a telephone call servicing apparatus, the method further comprising the step of: connecting a caller from which said speech input is received to a human service representative when it is determined that the generated emotional state indicator values stay above said second threshold for said predetermined period of time.
10. The method of claim 9, wherein said telephone call servicing apparatus is a telephone operator terminal.
11. An apparatus used to interact with a user of said apparatus, the apparatus comprising: a stress analysis module configured to generate an emotional state indicator value by performing stress analysis on received user speech; a control module configured to: compare said emotional state indicator value to a first threshold to determine if said emotional state indicator value is above said first threshold; compare said emotional state indicator value to a second threshold to determine if said emotional state indicator value is below said second threshold; and modify a user interface in response to determining that said emotional state indicator value is above said first threshold or below said second threshold.
12. The apparatus of claim 11, wherein said control module is further configured to, in modifying a user interface, slow the rate at which prompts are provided to a user of said apparatus when said emotional state indicator value is above said first threshold.
13. A machine-readable digital data storage device comprising a set of machine executable instructions for controlling a machine to perform the steps of: generating an emotional state indicator value by performing stress analysis on received user speech; comparing said emotional state indicator value to a first threshold to determine if said emotional state indicator value is above said first threshold; comparing said emotional state indicator value to a second threshold to determine if said emotional state indicator value is below said second threshold; and modifying a user interface in response to determining that said emotional state indicator value is above said first threshold or below said second threshold.
14. The machine-readable digital data storage device of claim 13, wherein the machine executable instructions for modifying a user interface include instructions for modifying the rate at which prompts are provided to a user of said apparatus.